INTRODUCTION


Automatic speech recognition (ASR) can be used to score oral
reading fluency (ORF) assessments to ameliorate current
inadequacies (e.g., administration errors, high opportunity
cost), and [...] part of a [...] solution [...] ORF. But more
research is needed on how ASR performs for diverse student groups.


The purpose of this study is to examine the accuracy of ORF 
scores as generated by ASR compared to humans, and in 
particular, differential effects for students with disabilities 
(SWD) and those receiving English language (EL) supports. 


Research Questions 


1. Are the agreement rates of ORF word scores between the 
humans and ASR lower for SWDs or ELs? 

2. Are the differences in ORF WCPM between humans and ASR 
exacerbated for SWDs or ELs? 


Sample. The total sample size was N = 650 students. 


                          Grade 2, N = 153  Grade 3, N = 182  Grade 4, N = 315
Female                    67 (44%)          79 (43%)          116 (37%)
Male                      73 (48%)          64 (35%)          118 (37%)
Missing                   13 (8.5%)         39 (21%)          81 (26%)

Ethnicity
Hispanic/Latino           28 (18%)          26 (14%)          41 (13%)
Not Hispanic/Latino       112 (73%)         117 (64%)         193 (61%)
Missing                   13 (8.5%)         39 (21%)          81 (26%)

Students with a Disability (SWD)
Yes                       21 (14%)          11 (6.0%)         33 (10%)
No                        119 (78%)         132 (73%)         201 (64%)
Missing                   13 (8.5%)         39 (21%)          81 (26%)

English Learners (EL)
Yes                       17 (11%)          12 (6.6%)         17 (5.4%)
No                        123 (80%)         131 (72%)         217 (69%)
Missing                   13 (8.5%)         39 (21%)          81 (26%)


Statistics presented: n (%) 


METHOD 


The following R packages were used:


Automatic speech recognition was less accurate scoring words for
students with disabilities, but the difference in scoring was
mitigated when reading scores were aggregated for passages.


Figure: Human-ASR Word Score Agreement Rate, in three panels (Grade 2, Grade 3, Grade 4).
Y-axis: Agreement Rate. X-axis groups: Non-SWD/EL, SWD, EL, SWD Missing, EL Missing.


RESULTS 


RQ 1: We fit mixed-effect generalized linear models (GLM) for 
each grade with random effects for student and passage, and 
regressed the word score agreement rate (proportion of words 
scored correct or incorrect by both the human and ASR for 
each student reading) on disability and EL status. 
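
To make the outcome variable concrete, the agreement rate for a single reading can be sketched as below. This is a minimal illustration rather than the study's scoring code: the function name `agreement_rate`, the 1/0 coding of correct/incorrect, and the example word scores are all assumptions made for the example.

```python
def agreement_rate(human_scores, asr_scores):
    """Proportion of words given the same score by the human and the ASR.

    Each list holds one score per word: 1 = read correctly, 0 = read incorrectly
    (hypothetical coding). A word counts as agreement when both scorers marked
    it correct or both marked it incorrect.
    """
    if len(human_scores) != len(asr_scores):
        raise ValueError("both scorers must score the same number of words")
    if not human_scores:
        raise ValueError("at least one word is required")
    matches = sum(h == a for h, a in zip(human_scores, asr_scores))
    return matches / len(human_scores)


# Hypothetical eight-word reading: the scorers disagree on words 3 and 7.
human = [1, 1, 0, 1, 1, 0, 1, 1]
asr = [1, 1, 1, 1, 1, 0, 0, 1]
rate = agreement_rate(human, asr)
print(rate)  # 0.75
```

One such rate per student reading is what the models regress on disability and EL status.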


Across Grades 2 to 4, the ORF word score agreement rates 
between human criterion and ASR were significantly lower 
for SWDs compared to their non-SWD/non-EL peers. There 
was no such difference for EL students.


Results of Word Score Agreement Rate Mixed-Effect GLMs, by Grade

                  Grade 2                           Grade 3                           Grade 4
                  Estimate    SE  z-value  p-value  Estimate    SE  z-value  p-value  Estimate    SE  z-value  p-value
Fixed Effects
  Intercept¹          2.55  0.08    30.86   < .001      3.05  0.07    43.89   < .001      3.20  0.09    34.99   < .001
  SWD-Missing          [?]   [?]    -0.60     .551       [?]   [?]    -1.56     .118     -0.11  0.08      [?]      [?]
  SWD                -0.85  0.20    -4.26   < .001     -0.94  0.19    -4.85   < .001     -0.48  0.14    -3.39   < .001
  EL-Missing             -     -        -        -       [?]   [?]     1.41      [?]         -     -        -        -
  EL                 -0.52  0.22    -2.41     .016     -0.40  0.19    -2.10     .035     -0.09  0.25    -0.35     .725
Random Effects²
  Passages             [?]     -        -        -       [?]     -        -        -      0.66     -        -        -
  Students            1.04     -        -        -      0.93     -        -        -      0.98     -        -        -

¹ The intercept represents non-SWD and non-EL students.
² Estimates reflect the standard deviations of the random effects.
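
The fixed-effect estimates are easier to read as agreement rates. The sketch below assumes the GLMs use a logit link, which the poster does not state, and applies the inverse logit to the Grade 2 intercept (2.55) and SWD estimate (-0.85) from the table. The results describe a typical student and passage (random effects set to zero) and are illustrative only.

```python
import math


def inv_logit(x):
    """Map a value on the log-odds scale to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Grade 2 estimates as reported in the table (assumed to be on the logit scale).
intercept = 2.55  # non-SWD, non-EL reference group
swd = -0.85       # shift in log-odds for students with a disability

p_reference = inv_logit(intercept)
p_swd = inv_logit(intercept + swd)

print(round(p_reference, 3))  # 0.928
print(round(p_swd, 3))        # 0.846
```

Under that assumption, the Grade 2 model implies an agreement rate of roughly 93% for the reference group and roughly 85% for SWDs.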


RQ 2: We fit mixed-effect linear models for each grade with 
random effects for student and passage, and regressed WCPM 
difference score (the human criterion score minus the ASR 
score) on disability and EL status. 
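
The WCPM difference score can be sketched as follows. WCPM is words correct per minute; the function names, the use of one shared reading time for both scorers, and the example counts are assumptions for illustration, not details taken from the study.

```python
def wcpm(words_correct, seconds):
    """Words correct per minute for one passage reading."""
    if seconds <= 0:
        raise ValueError("reading time must be positive")
    return words_correct * 60.0 / seconds


def wcpm_difference(human_correct, asr_correct, seconds):
    """Human criterion WCPM minus ASR WCPM (assumes one shared reading time)."""
    return wcpm(human_correct, seconds) - wcpm(asr_correct, seconds)


# Hypothetical reading: 60 seconds, human counts 52 words correct, ASR counts 49.
diff = wcpm_difference(52, 49, 60)
print(diff)  # 3.0
```

A positive difference means the human scored the reading higher than the ASR did.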


The differences in ORF WCPM scores between human and ASR 
were not exacerbated for SWD or EL students. 


                  Grade 2                  Grade 3                  Grade 4
                  Estimate    SE  t-value  Estimate    SE  t-value  Estimate    SE  t-value
Fixed Effects
  Intercept¹          4.52  0.83     5.42      3.96  0.59     6.74      4.76  0.71     6.73
  SWD-Missing        -0.82   [?]    -0.42      4.06  8.62     0.47       [?]   [?]    -0.83
  SWD                -1.02  2.06    -0.50      0.24  1.72     0.14     -0.85  1.43    -0.59
  EL-Missing             -     -        -     -4.25  8.53    -0.50         -     -        -
  EL                 -2.30  2.22    -1.04      1.48  1.67     0.89     -2.08  2.26    -0.92
Random Effects²
  Passages            2.21     -        -      1.64     -        -      3.39     -        -
  Students           10.59     -        -      8.04     -        -      8.68     -        -
  Residual            8.10     -        -      8.28     -        -      8.40     -        -

¹ The intercept represents non-SWD and non-EL students.
² Estimates reflect the standard deviations of the random effects.


Conclusion. We speculate that the ASR may be less accurate 
than a human scorer for SWDs at the word level, but the 
difference in scoring is mitigated when scores are aggregated 
at the passage level. 